Gradient-Based Training of Gaussian Mixture Models for High-Dimensional Streaming Data

Abstract

We present an approach for efficiently training Gaussian Mixture Models (GMMs) by Stochastic Gradient Descent (SGD) with non-stationary, high-dimensional streaming data. Our training scheme does not require data-driven parameter initialization (e.g., k-means) and can thus be trained from a random initial state. Furthermore, the approach allows mini-batch sizes as low as 1, which are typical for streaming-data settings. Major problems in such settings are undesirable local optima during early training phases and numerical instabilities due to high data dimensionalities. We introduce an adaptive annealing procedure to address the first problem, whereas numerical instabilities are eliminated by an exponential-free approximation to the standard GMM log-likelihood. Experiments on a variety of visual and non-visual benchmarks show that our SGD approach can be trained completely without, for instance, k-means based centroid initialization. It also compares favorably to an online variant of Expectation-Maximization (EM), stochastic EM (sEM), which it outperforms by a large margin for very ...
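The numerical instability mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the approximation, so the example below assumes one possible exponential-free choice: replacing the sum over components with the single best-matching component (a lower bound on the exact log-likelihood). The function names, the diagonal covariances, and the toy data are illustrative assumptions, not taken from the paper.

```python
import math

def component_log_scores(x, weights, means, variances):
    """Return log(pi_k) + log N(x | mu_k, diag(var_k)) for each component k."""
    scores = []
    for pi_k, mu_k, var_k in zip(weights, means, variances):
        log_density = 0.0
        for x_d, mu_d, var_d in zip(x, mu_k, var_k):
            log_density += -0.5 * (math.log(2.0 * math.pi * var_d)
                                   + (x_d - mu_d) ** 2 / var_d)
        scores.append(math.log(pi_k) + log_density)
    return scores

def naive_log_likelihood(scores):
    # Direct evaluation of log(sum_k exp(score_k)). In high dimensions every
    # exp() underflows to 0.0, so math.log(0.0) raises ValueError.
    return math.log(sum(math.exp(s) for s in scores))

def max_component_log_likelihood(scores):
    # Assumed exponential-free approximation: keep only the largest term.
    # This is a lower bound on the exact log-likelihood.
    return max(scores)

# Toy high-dimensional sample (hypothetical data): 1000 dimensions, 2 components.
dim = 1000
x = [3.0] * dim
weights = [0.5, 0.5]
means = [[0.0] * dim, [1.0] * dim]
variances = [[1.0] * dim, [1.0] * dim]

scores = component_log_scores(x, weights, means, variances)
approx = max_component_log_likelihood(scores)

try:
    naive = naive_log_likelihood(scores)
except ValueError:
    naive = None  # the direct evaluation failed because of underflow
```

In this toy setting the direct evaluation fails while the max-based value stays finite. Because it involves no exponentials, it would remain usable as an SGD loss at high dimensionality under the stated assumptions.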

Similar resources

Two-way Gaussian mixture models for high dimensional classification

Mixture discriminant analysis (MDA) has gained applications in a wide range of engineering and scientific fields. In this paper, under the paradigm of MDA, we propose a two-way Gaussian mixture model for classifying high dimensional data. This model regularizes the mixture component means by dividing variables into groups and then constraining the parameters for the variables in the same group ...

Gaussian mixture models for the classification of high-dimensional vibrational spectroscopy data

In this work, a family of generative Gaussian models designed for the supervised classification of high-dimensional data is presented as well as the associated classification method called High Dimensional Discriminant Analysis (HDDA). The features of these Gaussian models are: i) the representation of the input density model is smooth; ii) the data of each class are modeled in a specific subsp...

High-Dimensional Clustering with Sparse Gaussian Mixture Models

We consider the problem of clustering high-dimensional data using Gaussian Mixture Models (GMMs) with unknown covariances. In this context, the Expectation-Maximization (EM) algorithm, which is typically used to learn GMMs, fails to cluster the data accurately due to the large number of free parameters in the covariance matrices. We address this weakness by assuming that the mixture model consis...

Regularized Parameter Estimation in High-Dimensional Gaussian Mixture Models

Finite Gaussian mixture models are widely used in statistics thanks to their great flexibility. However, parameter estimation for Gaussian mixture models with high dimensionality can be challenging because of the large number of parameters that need to be estimated. In this letter, we propose a penalized likelihood estimator to address this difficulty. The [Formula: see text]-type penalty we im...

High dimensional Sparse Gaussian Graphical Mixture Model

This paper considers the problem of network reconstruction from heterogeneous data using a Gaussian Graphical Mixture Model (GGMM). It is well known that parameter estimation in this context is challenging due to large numbers of variables coupled with the degenerate nature of the likelihood. We propose as a solution a penalized maximum likelihood technique by imposing an l1 penalty on the pre...

Journal

Journal title: Neural Processing Letters

Year: 2021

ISSN: 1573-773X, 1370-4621

DOI: https://doi.org/10.1007/s11063-021-10599-3